The semantically annotated corpus of Polish quantificational expressions

نویسندگان

چکیده

Abstract The paper presents a manually annotated corpus of Polish quantificational expressions. quantifier annotation was conducted on top existing gold-standard data for as its separate layer. This releases the and gives an overview related tools. As far we know, this is first large-scale generalized quantifiers together with their crucial semantic properties, including monotonicity profile. We also discuss potential further use in linguistics cognitive science.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Semantically Annotated Swedish Medical Corpus

With the information overload in the life sciences there is an increasing need for annotated corpora, particularly with biological and biomedical entities, which is the driving force for data-driven language processing applications and the empirical approach to language study. Inspired by the work in the GENIA Corpus, which is one of the very few of such corpora, extensively used in the biomedi...

متن کامل

Developing a large semantically annotated corpus

What would be a good method to provide a large collection of semantically annotated texts with formal, deep semantics rather than shallow? We argue that a bootstrapping approach comprising state-of-the-art NLP tools for parsing and semantic interpretation, in combination with a wiki-like interface for collaborative annotation of experts, and a game with a purpose for crowdsourcing, are the star...

متن کامل

YAWN: A Semantically Annotated Wikipedia XML Corpus

The paper presents YAWN, a system to convert the well-known and widely used Wikipedia collection into an XML corpus with semantically rich, self-explaining tags. We introduce algorithms to annotate pages and links with concepts from the WordNet thesaurus. This annotation process exploits categorical information in Wikipedia, which is a high-quality, manually assigned source of information, extr...

متن کامل

A Semantically Annotated Corpus from MEDLINE Abstracts

Automatic information extraction is a key technology to help researchers access the information contained in research papers and to extend databases on substances and biological processes. We aim to build information extraction databases [2] from biochemical papers and their abstracts available from the MEDLINE [3] database. To objectively measure the performance of our systems, we built a corp...

متن کامل

GENIA corpus - a semantically annotated corpus for bio-textmining

MOTIVATION Natural language processing (NLP) methods are regarded as being useful to raise the potential of text mining from biological literature. The lack of an extensively annotated corpus of this literature, however, causes a major bottleneck for applying NLP techniques. GENIA corpus is being developed to provide reference materials to let NLP techniques work for bio-textmining. RESULTS G...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Language Resources and Evaluation

سال: 2022

ISSN: ['1574-020X', '1574-0218']

DOI: https://doi.org/10.1007/s10579-022-09578-4